Booster: tunable in-place equation evaluation at rewritten subterms (--equation-max-local-steps)#4153
Conversation
…attern/ApplyEquations: local-fixpoint evaluation behind --equation-local-fixpoint iterateEquations restarts the full term traversal whenever any node changes, and within a pass a successful application's RHS is not re-evaluated at its position. Each firing therefore triggers a whole-term sweep in which every ancestor of the changed node is a novel term and parents match against unevaluated RHS redexes — a sweep multiplier on equation attempts that grows with the length of causal rewrite chains. Behind the new --equation-local-fixpoint flag (default off, for A/B measurement), the BottomUp equation pass wraps its evaluation hook: when a node is rewritten, run the LLVM pass on the result (preserving the LLVM-before-equations ordering of the global loop) and re-enter the cached recursion on it, normalizing the new subtree in place. Ancestors then see children in final form and the global loop converges in 1-2 passes instead of one per causal chain step. Whole-term snapshots no longer catch node-level oscillations, so the chain of local rewrites along the current path is tracked in a path-scoped localSteps stack (saved/restored around each local recursion, reset for nested predicate evaluation) and checked per step, throwing EquationLoop; maxIterations keeps counting global passes only, so TooManyIterations semantics and its partial result are unchanged. traverseTerm itself is untouched (it is shared with the LLVM and path-condition passes). Validation: with the flag on, a causal chain of depth 101 that exceeds the global-pass limit today normalizes in a single pass, the loopDef oscillation is still detected as EquationLoop, and standard scenarios give identical results. The flag-window test is sequenced after the iteration-limit test it would otherwise race with (the test binary runs tests in parallel). Full unit suite passes; test-retain-condition-cache passes with byte-identical output with --equation-local-fixpoint enabled. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…attern/ApplyEquations: replace the local-fixpoint flag with a per-pass budget, on by default (--equation-max-local-steps) The Bool flag forked evaluation into two strategies and left the in-place recursion unbounded (a single pass could apply arbitrarily many equations, with loop detection as the only brake). Replace it with a budget: at most maxLocalSteps in-place rewrites per traversal pass (--equation-max-local-steps, default 20, chosen on the expectation that a few applications normalize a side condition to its final truth value). On exhaustion, rewritten nodes are returned without recursion — exactly the restart-only strategy — and the global loop picks them up on the next pass. Restart-only evaluation is therefore the budget-0 special case of a single code path rather than a separate mode, the new strategy is the default, and total work is again bounded by maxIterations passes times (budget plus one application per node), with TooManyIterations and its partial result reached as before. Loop detection becomes two-layered: the path-scoped per-step check catches oscillations shorter than the budget immediately, and cycles that survive a pass boundary recur in the existing whole-term snapshots within at most one cycle period of passes (a period dividing the budget repeats the snapshot on the very next pass). Validation: bottom-up chains beyond the old one-application-per-pass limit complete with defaults; a lowered pass limit with the default budget still aborts an over-long chain with a partial result; budget 0 reproduces the old depth-100/101 success/abort boundary exactly; oscillations are detected per local step with the budget on and by whole-term snapshots with it off. Full unit suite passes; test-retain-condition-cache produces byte-identical output with the default budget. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
… binding, ambiguous record updates) The --pedantic CI build (-Werror) rejected the leftover isTooManyIterations helper, unused since the iteration-limit test moved into the budget-0 window, and the EquationOptions record updates, which are ambiguous under DuplicateRecordFields because the field names are shared with EquationConfig. Construct the options explicitly instead. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…log golden for in-place evaluation In-place evaluation attempts the INT.lt hook directly at the rewritten constraint subterm (logged failure on the symbolic argument) and the later simplify pass no longer re-attempts it (cache hit). Log-trace changes only; all RPC responses are unchanged. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…efault --equation-max-local-steps to 0 (restart-only) Make the in-place budget opt-in so the default behavior is identical to the existing restart-only evaluation strategy. The unit tests now pin the legacy iteration-limit boundary at the default and exercise the in-place behaviors (deeper chains, per-step loop detection, combined bound) in an explicit budget-20 window. The test-log-simplify-json golden reverts to the upstream trace, guarding that the default changes nothing. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…tart loop and the --equation-max-local-steps budget The booster description's equation section now explains the restart-from-top pass structure that was previously implicit (and described as hard-coded), and how the in-place budget generalises it. Also fixes a typo (traverser) in the strategy bullets. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
jberthold
left a comment
There was a problem hiding this comment.
Results look quite good 🎉
but I would probably change a few details here, not least to make it easier to understand the code and the algorithm it implements.
| , localSteps = [] | ||
| , localStepCount = 0 |
There was a problem hiding this comment.
Do we have the invariant that length localSteps == localStepCount ?
If that is true then we might not need a separate counter.. but it might be good to have it for large maxLocalSteps instead of computing length many times.
I did not find a place where the localSteps are reset in the iteration, only the counter is reset.
EDIT: I found where the stack is popped, and the localStepCount is different from what I thought. I'd make it per level to get the mentioned invariant and make it easier to see what is going on.
There was a problem hiding this comment.
Done: 3def8ed — the counter is now restored together with the stack around the recursion, so localStepCount == length localSteps is a state invariant; the counter only exists to avoid computing length per step.
| -- each pass gets a fresh in-place rewriting budget | ||
| eqState . modify $ \s -> s{localStepCount = 0} |
There was a problem hiding this comment.
First I was a bit confused here because
a) only the counter is reset here, not the localSteps stack
b) should the counter not be reset at every level of the tree?
EDIT: After commenting on the localFixPointEval I understand that the counter may be counting evaluations of the entire tree during one iteration. I'd make it per level, though.
There was a problem hiding this comment.
Done: 3def8ed — the per-pass reset is gone; the counter is path-scoped alongside localSteps (per level, as you suggested), so the budget bounds each in-place chain's depth.
| { localSteps = t' : s.localSteps | ||
| , localStepCount = s.localStepCount + 1 | ||
| } | ||
| result <- llvmSimplify t' >>= recurse |
There was a problem hiding this comment.
Is the idea here to repeat the evaluation of t at this level of the AST ? Or all levels from the bottom up up to this level again?
The code is currently doing the latter because above, simp is passed as the recurse parameter.
There was a problem hiding this comment.
All levels, and that is deliberate: the rewrite builds a new subterm from the rule's RHS whose arguments need full evaluation before equations at this level (or above) can match decisively. The descent is cut short by the equation cache / isEvaluated at the substituted subject parts, so after the first step of a chain the re-entry costs one cache lookup per argument. Documented in the comment on localFixpointEval in 3def8ed.
There was a problem hiding this comment.
The examples I looked at would not benefit from additional simplification at just the top-level, they require stepping into the arguments of the functions that are produced on the RHS, evaluating those, and then evaluating the top-level again.
| let onEval | ||
| | direction == BottomUp = localFixpointEval simp |
There was a problem hiding this comment.
As commented in localFixpointEval, passing simp here (with BottomUp direction) means that the entire subtree from a node downwards is evaluated repeatedly. Maybe this is cut short because of an isEvaluated flag (see cached definition) so it is quick, but still a traversal.
The intention was to repeat evaluation of the current node in the AST, not re-evaluate all nodes below, so this should probably just use onEval itself for the recursion.
| let onEval | |
| | direction == BottomUp = localFixpointEval simp | |
| let onEval | |
| | direction == BottomUp = localFixpointEval onEval |
There was a problem hiding this comment.
We considered this but kept the full re-entry: re-applying at the node only would leave the RHS's new argument redexes unevaluated this pass, making matches at the node (and at ancestors) indeterminate — for function equations that is an abort, not a skip. The cache already prevents re-traversal of unchanged parts. Rationale added to the localFixpointEval comment in 3def8ed.
| eqState . modify $ \s -> | ||
| s | ||
| { localSteps = t' : s.localSteps | ||
| , localStepCount = s.localStepCount + 1 | ||
| } |
There was a problem hiding this comment.
Since st has been captured, the code does not have to use modify here. Also, the counter could be pre-evaluated to avoid a thunk (but it will be forced in each iteration anyway).
| eqState . modify $ \s -> | |
| s | |
| { localSteps = t' : s.localSteps | |
| , localStepCount = s.localStepCount + 1 | |
| } | |
| let !newCount = st.localStepCount + 1 | |
| eqState $ put | |
| st | |
| { localSteps = t' : st.localSteps | |
| , localStepCount = newCount | |
| } |
| , localStepCount = s.localStepCount + 1 | ||
| } | ||
| result <- llvmSimplify t' >>= recurse | ||
| eqState . modify $ \s -> s{localSteps = st.localSteps} |
There was a problem hiding this comment.
This line effectively pops the localSteps stack but does not reset the counter.
A subsequent call to evaluate an AST node upward of this one will inherit the counter. I think the counter could be reset to st.localStepCount after recurse as well, then it would be st.localStepCount == length st.localSteps as a state invariant.
There was a problem hiding this comment.
Done: 3def8ed — the counter is reset to st.localStepCount after recurse, establishing the invariant.
…cope the in-place budget (localStepCount == length localSteps) Per review: restore the counter together with the chain around the recursion, making the budget bound each in-place chain's depth rather than the total per pass, with localStepCount == length localSteps as a state invariant (the counter exists only to avoid computing the length per step). The per-pass reset in the global loop becomes redundant and is removed. Also use put on the captured state with a pre-forced counter instead of modify, and align the option help and docs with the per-chain budget semantics. The in-place recursion deliberately re-enters the full bottom-up traversal of the rewritten result (rather than re-applying at the rewritten node only): the rewrite builds a new subterm from the rule's RHS whose arguments need full evaluation, and the equation cache cuts the descent short at already-normal substituted subject parts. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
This PR generalizes booster's equation evaluation strategy. Today, when an equation rewrites a subterm during the bottom-up simplification traversal, re-normalization happens by restarting whole-term passes from the top, so a rewrite chain advances one step per pass. This PR adds a per-pass budget,
--equation-max-local-steps N: after a rewrite, the result is re-simplified in place (LLVM evaluation plus re-entering the bottom-up traversal) for up to N steps per pass before deferring back to the global restart loop. N=0 — the default — is exactly the existing restart-only behavior as the degenerate case of the same code path, so this changes nothing out of the box; the knob lets users tune how deep in-place simplification chains may run.Changes:
--equation-max-local-steps(default 0, identical to current behavior); the budget resets every global pass.Booster.Pattern.ApplyEquations: on a rewrite in the bottom-up pass,localFixpointEvalre-enters the cached traversal in place, counting re-entries against the budget.--equation-max-iterationspasses times (budget + 1) applications per chain, with two-layer loop detection (per-step chain check inside the budget, whole-term snapshots across passes, which catch cycles longer than the budget).Validation:
test-log-simplify-jsonlog-trace golden — the default is trace-identical to master.ecrecoverloop02) regresses +40–50%, because an in-place chain completes a second nested memory write before the restart-pass concat rules flatten the first, and booster hands the resulting shape to Kore (Kore-side equation applications jump 2 980 → 26 477). The regression is an indicator of missing simplification lemmas /preserves-definednessannotations, not of the strategy.